Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing Systems

We sincerely thank all reviewers and the AC for their time and effort. Our paper proposes a method for learning object proposals as a ConvNet that shares features and computation with a Fast R-CNN detection network, leading to real-time detection rates with state-of-the-art accuracy. We are glad that the reviews are consistently positive, and we appreciate the constructive suggestions for improving our paper. The reviewers noted that "the paper will have real practical impact in object detection" (R4), "the experimental results are solid" (R1, R5), and the paper "will be very interesting to the computer vision community as the results are impressive" (R6). We address the reviewers' concerns and questions as follows.


Reviews: Tree-Structured Reinforcement Learning for Sequential Object Localization

Neural Information Processing Systems

I liked the ideas presented in the paper. This is the first sequential search strategy I have seen that tries to output all objects in an image and does not impose an arbitrary, hard-to-justify order on the boxes during training. I have one minor request for clarification, and then some suggestions for strengthening the experimental section, which to me would make the difference between a poster and an oral. The point of clarification: it seems that the ranking of the output proposals is simply the depth of the tree at which they are discovered. Why is this a good ranking?


Learning to Segment Object Candidates

Neural Information Processing Systems

Recent object detection systems rely on two critical steps: (1) a set of object proposals is predicted as efficiently as possible, and (2) this set of candidate proposals is then passed to an object classifier. Such approaches have been shown to be fast while achieving state-of-the-art detection performance. In this paper, we propose a new way to generate object proposals, introducing an approach based on a discriminative convolutional network. Our model is trained jointly with two objectives: given an image patch, the first part of the system outputs a class-agnostic segmentation mask, while the second part of the system outputs the likelihood of the patch being centered on a full object. At test time, the model is efficiently applied to the whole test image and generates a set of segmentation masks, each assigned a corresponding object likelihood score. We show that our model yields significant improvements over state-of-the-art object proposal algorithms. In particular, compared to previous approaches, our model obtains substantially higher object recall using fewer proposals. We also show that our model is able to generalize to object categories it has not seen during training. Unlike all previous approaches for generating object masks, we do not rely on edges, superpixels, or any other form of low-level segmentation.
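The joint training objective described above, a shared trunk feeding a class-agnostic mask head and a scalar objectness head, can be sketched as a toy two-headed network. This is a minimal numpy illustration of the idea only: the layer sizes, random weights, and fully-connected layout are placeholders, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a flattened 32x32 patch, a 64-unit shared trunk,
# and a 16x16 output mask. All weights are random placeholders.
D_IN, D_HID, MASK_HW = 32 * 32, 64, 16

W_trunk = rng.normal(scale=0.1, size=(D_IN, D_HID))
W_mask = rng.normal(scale=0.1, size=(D_HID, MASK_HW * MASK_HW))
W_score = rng.normal(scale=0.1, size=(D_HID, 1))

def forward(patch):
    """One forward pass: shared features feed both output heads."""
    feat = np.maximum(patch.reshape(-1) @ W_trunk, 0.0)   # shared trunk (ReLU)
    mask = 1.0 / (1.0 + np.exp(-(feat @ W_mask)))         # per-pixel foreground prob.
    score = 1.0 / (1.0 + np.exp(-(feat @ W_score).item()))  # objectness likelihood
    return mask.reshape(MASK_HW, MASK_HW), score

mask, score = forward(rng.normal(size=(32, 32)))
```

Because the trunk is shared, at test time it can be computed once per image location and both outputs read off together, which is what makes the dense full-image application efficient.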


Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Neural Information Processing Systems

State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet [7] and Fast R-CNN [5] have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. RPNs are trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. With a simple alternating optimization, the RPN and Fast R-CNN can be trained to share convolutional features. For the very deep VGG-16 model [19], our detection system has a frame rate of 5 fps (including all steps) on a GPU, while achieving state-of-the-art object detection accuracy on PASCAL VOC 2007 (73.2% mAP) and 2012 (70.4% mAP) using 300 proposals per image.
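The "predicts object bounds at each position" step is built on reference boxes (anchors) enumerated densely over the shared feature map. The sketch below shows one plausible anchor-generation scheme under the assumptions stated in the comments; the function name, stride, and the scale/ratio values are illustrative, not taken from the paper's released code.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Enumerate reference boxes ('anchors') at every feature-map cell.

    Each cell of the shared conv feature map gets len(scales) * len(ratios)
    boxes centred on it, returned as (x1, y1, x2, y2) image coordinates.
    Scale s and aspect ratio r give a box of area s**2 with height/width = r.
    """
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride
            for s in scales:
                for r in ratios:
                    w = s * np.sqrt(1.0 / r)
                    h = s * np.sqrt(r)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return np.array(anchors)

# A 37x62 feature map (roughly a 600x1000 image at stride 16) yields
# 37 * 62 * 9 = 20646 anchors before any score-based filtering.
A = generate_anchors(37, 62)
```

The RPN then regresses offsets from each anchor and scores it for objectness; keeping only the top-scoring boxes is how a dense set like this is reduced to the 300 proposals per image quoted above.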


Likelihood-Free Gaussian Process for Regression

Shikuri, Yuta

arXiv.org Machine Learning

Gaussian process regression can flexibly represent the posterior distribution of an interest parameter given sufficient information on the likelihood. However, in some cases we have little knowledge of the probability model. For example, when investing in a financial instrument, the probability model of cash flow is generally unknown. In this paper, we propose a novel framework called the likelihood-free Gaussian process (LFGP), which allows representation of the posterior distributions of interest parameters for scalable problems without directly specifying their likelihood functions. The LFGP establishes clusters in which the value of the interest parameter can be considered approximately identical, and it approximates the likelihood of the interest parameter in each cluster by a Gaussian, using the asymptotic normality of the maximum likelihood estimator. We expect that the proposed framework will contribute significantly to likelihood-free modeling, particularly by reducing the assumptions on the probability model and the computational costs for scalable problems.
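The cluster-then-Gaussianize idea in the abstract can be sketched in a few lines: summarize each cluster by the MLE of its parameter and the MLE's asymptotic variance, then run ordinary GP regression on those summaries with per-cluster (heteroscedastic) noise. The RBF kernel, its length scale, and the synthetic data below are assumptions for illustration, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(a, b, ell=0.5):
    """Squared-exponential kernel between 1-D input vectors a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

# Synthetic clusters: at each input x, 50 noisy draws of f(x) = sin(3x).
xs = np.linspace(0.0, 2.0, 8)
means, vars_ = [], []
for x in xs:
    draws = np.sin(3 * x) + rng.normal(scale=0.3, size=50)
    means.append(draws.mean())              # MLE of the cluster parameter
    vars_.append(draws.var() / len(draws))  # asymptotic variance of the MLE
means, vars_ = np.array(means), np.array(vars_)

# GP posterior mean at test points, with the Gaussianized cluster
# likelihoods entering as heteroscedastic observation noise.
xt = np.linspace(0.0, 2.0, 50)
K = rbf(xs, xs) + np.diag(vars_)
alpha = np.linalg.solve(K, means)
post_mean = rbf(xt, xs) @ alpha
```

The point of the construction is that nothing about the raw per-draw likelihood is needed: only the cluster-level MLE and its variance enter the GP, and the noise term shrinks as cluster sizes grow, in line with the asymptotic-normality argument.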


Machine learning based co-creative design framework

Quanz, Brian, Sun, Wei, Deshpande, Ajay, Shah, Dhruv, Park, Jae-eun

arXiv.org Artificial Intelligence

We propose a flexible, co-creative framework bringing together multiple machine learning techniques to assist human users to efficiently produce effective creative designs. We demonstrate its potential with a perfume bottle design case study, including human evaluation and quantitative and qualitative analyses.


FastMask: Segment Multi-scale Object Candidates in One Shot

Hu, Hexiang, Lan, Shiyi, Jiang, Yuning, Cao, Zhimin, Sha, Fei

arXiv.org Artificial Intelligence

Objects appear at different scales in natural images. This fact requires methods dealing with object-centric tasks (e.g. object proposal) to perform robustly over variation in object scale. In this paper, we present a novel segment proposal framework, namely FastMask, which takes advantage of hierarchical features in deep convolutional neural networks to segment multi-scale objects in one shot. Innovatively, we decompose the segment proposal network into three functional components (body, neck and head). We further propose a weight-shared residual neck module as well as a scale-tolerant attentional head module for efficient one-shot inference. On the MS COCO benchmark, the proposed FastMask outperforms all state-of-the-art segment proposal methods in average recall while being 2 to 5 times faster. Moreover, with a slight trade-off in accuracy, FastMask can segment objects in near real time (~13 fps) on 800x600 resolution images, demonstrating its potential in practical applications. Our implementation is available at https://github.com/voidrank/FastMask.
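The one-shot multi-scale scheme, a shared neck repeatedly shrinking the feature map while a single head scores fixed-size windows at every level, can be illustrated with a skeleton like the one below. The average-pooling neck, the dummy window enumerator, and all shapes are stand-ins chosen for clarity; they are not the actual FastMask modules.

```python
import numpy as np

def neck(feat):
    """2x2 average pooling as a stand-in for the weight-shared residual neck."""
    h, w, c = feat.shape
    return feat[: h // 2 * 2, : w // 2 * 2].reshape(
        h // 2, 2, w // 2, 2, c).mean(axis=(1, 3))

def head(feat, win=4):
    """Enumerate every win x win window on one pyramid level (dummy 'scorer')."""
    h, w, _ = feat.shape
    return [(y, x) for y in range(h - win + 1) for x in range(w - win + 1)]

# One forward pass: build the pyramid by reapplying the shared neck, and
# let the same head sweep each level. A fixed window on a coarser level
# corresponds to a larger object in the image, covering all scales at once.
feat = np.zeros((32, 32, 8))  # placeholder body output (H, W, C)
pyramid, windows = [feat], []
while min(pyramid[-1].shape[:2]) >= 4:
    level = len(pyramid) - 1
    windows += [(level, yx) for yx in head(pyramid[-1])]
    pyramid.append(neck(pyramid[-1]))
```

Because the neck weights are shared across levels and the whole pyramid comes from a single body pass, the cost is close to that of one forward pass, which is what makes near-real-time inference plausible.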